Rank | Count | Beginning |
---|---|---|
2730 | 2630 | Астероидның |
24504 | 773 | Ул |
7443 | 721 | Бу |
10468 | 684 | Елга |
10817 | 635 | Елганың |
2108 | 385 | Аның |
26190 | 288 | Халык |
28209 | 282 | Шулай |
28493 | 231 | Шул |
19043 | 219 | Округ |
6446 | 199 | Беренче |
1146 | 197 | Алар |
15683 | 176 | Кратерга |
27859 | 176 | Шәһәр |
26721 | 146 | Хәзерге |
16584 | 120 | Ләкин |
6367 | 111 | Бер |
11804 | 111 | Әлеге |
12029 | 110 | Әмма |
13368 | 110 | Иң |
12 | 109 | » |
536 | 107 | Авыл |
26192 | 99 | Халыкара |
14032 | 96 | Казан |
10212 | 92 | Ә |
28760 | 88 | Шуңа |
1818 | 84 | Анда |
22820 | 83 | Татарстан |
2065 | 82 | Аны |
9551 | 82 | Гыйбадәтханә |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word N-grams at the beginning of sentences give some insight into sentence composition.
For N = 1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt
from sentences
group by substring_index(sentence, ' ', 1)
order by cnt desc
limit 50;
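The same count can be sketched outside the database, which also makes the generalisation to N = 2, 3, 4 explicit. The following is a minimal Python sketch, not the pipeline used to produce the table above. The function name `sentence_beginnings` and its parameters are hypothetical, and it assumes sentences are plain strings with words separated by single spaces, as the query does.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent n-word sentence beginnings.

    Hypothetical helper: mirrors substring_index(sentence, ' ', n)
    followed by GROUP BY / ORDER BY cnt DESC / LIMIT in the query above.
    """
    counts = Counter()
    for sentence in sentences:
        # Keep everything before the n-th space; if the sentence has
        # fewer than n spaces, the whole sentence is kept.
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    # most_common sorts by descending count, like ORDER BY cnt DESC.
    return counts.most_common(limit)
```

Calling `sentence_beginnings(sentences, n=2)` would give the two-word beginnings covered in the next subsection. Unlike the SQL query, this sketch holds all distinct beginnings in memory, which is fine for small corpora but is an assumption worth revisiting for large ones.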